abstractive text summarization
InforME: Improving Informativeness of Abstractive Text Summarization With Informative Attention Guided by Named Entity Salience
Shen, Jianbin, Liang, Christy Jie, Xuan, Junyu
Abstractive text summarization is integral to the Big Data era, which demands advanced methods to turn voluminous and often long text data into concise but coherent and informative summaries for efficient human consumption. Despite significant progress, there is still room for improvement in various aspects; one such aspect is informativeness. Hence, this paper proposes a novel learning approach consisting of two methods: an optimal transport-based informative attention method to improve learning of focal information in reference summaries, and an accumulative joint entropy reduction method on named entities to enhance informative salience. Experimental results show that our approach achieves better ROUGE scores than prior work on CNN/Daily Mail while remaining competitive on XSum. Human evaluation of informativeness also demonstrates the better performance of our approach over a strong baseline. Further analysis gives insight into the plausible reasons underlying the evaluation results.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Asia > Middle East > Bahrain (0.04)
- (14 more...)
- Leisure & Entertainment > Sports (0.68)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
- Law (0.46)
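The optimal-transport idea behind the "informative attention" method above can be made concrete with a toy sketch: compute an entropic-regularized transport plan (via Sinkhorn iterations) that moves a model's attention mass toward a reference salience distribution. Everything here — the marginals, the position-based cost, and the loss — is an illustrative assumption, not the paper's actual implementation.

```python
import math

def sinkhorn(cost, a, b, eps=0.5, iters=500):
    """Entropic-regularized optimal transport plan between marginals a and b."""
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternating scaling: u_i = a_i / (Kv)_i, then v_j = b_j / (K^T u)_j
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy alignment: pull attention mass toward an entity-salience distribution.
attention = [0.7, 0.2, 0.1]   # model attention over 3 tokens (assumed)
salience = [0.2, 0.3, 0.5]    # reference salience, e.g. from named entities
cost = [[abs(i - j) for j in range(3)] for i in range(3)]  # position cost
plan = sinkhorn(cost, attention, salience)
ot_loss = sum(plan[i][j] * cost[i][j] for i in range(3) for j in range(3))
```

Minimizing `ot_loss` as a training signal would penalize attention that is far (under the cost) from the salient tokens; the paper's formulation may differ in cost design and regularization.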
A Split-then-Join Approach to Abstractive Summarization for Very Long Documents in a Low Resource Setting
The $\texttt{BIGBIRD-PEGASUS}$ model achieves $\textit{state-of-the-art}$ results on abstractive summarization of long documents. However, its capacity is still limited to a maximum of $4,096$ tokens, which causes performance degradation when summarizing very long documents. A common way to deal with this limitation is to truncate the documents. In this research, we take a different approach: we fine-tune the pretrained $\texttt{BIGBIRD-PEGASUS}$ model on a dataset from another domain. First, we filter out all documents shorter than $20,000$ tokens to focus on very long documents. To prevent domain shift and overfitting during transfer learning on the small dataset, we augment the dataset by splitting each document-summary training pair into parts, so that each document fits into $4,096$ tokens. Source code is available at $\href{https://github.com/lhfazry/SPIN-summ}{https://github.com/lhfazry/SPIN-summ}$.
- Asia > Indonesia (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (5 more...)
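The split step described above can be sketched as greedy sentence packing into chunks that fit the encoder window. The whitespace "tokenizer" below is an assumption for illustration; the actual pipeline would count subword tokens with the BIGBIRD-PEGASUS tokenizer.

```python
def split_document(text, max_tokens=4096):
    """Greedily pack sentences into chunks of at most max_tokens tokens."""
    chunks, current, current_len = [], [], 0
    for sentence in text.split(". "):
        n = len(sentence.split())  # whitespace token count (an assumption)
        if current and current_len + n > max_tokens:
            chunks.append(". ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(". ".join(current))
    return chunks

doc = ". ".join(["one two three"] * 5)  # five 3-token "sentences"
parts = split_document(doc, max_tokens=10)
```

A real implementation would also have to split the paired reference summary consistently with the document parts, which this sketch does not attempt.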
Advancements in Natural Language Processing for Automatic Text Summarization
Jayatilleke, Nevidu, Weerasinghe, Ruvan, Senanayake, Nipuna
The substantial growth of textual content across diverse domains and platforms has created a considerable need for Automatic Text Summarization (ATS) techniques that aid the process of text analysis. Advances in Natural Language Processing (NLP) and Deep Learning (DL) have significantly enhanced the effectiveness of text summarization models in a variety of technical domains. Despite this, summarizing textual information remains significantly constrained by the intricate writing styles of different texts, which involve a range of technical complexities. Text summarization techniques can be broadly categorized into two main types: extractive summarization and abstractive summarization. Extractive summarization directly extracts sentences, phrases, or segments of text from the content without making any changes. Abstractive summarization, on the other hand, reconstructs the sentences, phrases, or segments of the original text using linguistic analysis. Through this study, a linguistically diverse categorization of text summarization approaches is addressed in a constructive manner. The authors explore existing hybrid techniques that employ both extractive and abstractive methodologies, and also investigate the pros and cons of the various approaches discussed in the literature. Furthermore, the authors conduct a comparative analysis of different techniques and metrics for evaluating summaries generated by language generation models. This survey endeavors to provide a comprehensive overview of ATS by presenting the progression of language processing on this task, through a breakdown of diverse systems and architectures accompanied by technical and mathematical explanations of their operations.
- Asia > Singapore (0.04)
- Asia > Sri Lanka (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- (5 more...)
- Overview (1.00)
- Research Report > Experimental Study (0.46)
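To make the extractive/abstractive distinction in the abstract above concrete, here is a minimal frequency-based *extractive* summarizer: it copies the highest-scoring source sentence verbatim, which is exactly what abstractive methods do not do. This is a textbook illustration, not a method from the surveyed papers.

```python
from collections import Counter

def extractive_summary(text, k=1):
    """Return the k source sentences with the highest average word frequency."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freqs = Counter(w.lower() for s in sentences for w in s.split())
    scored = sorted(
        sentences,
        key=lambda s: sum(freqs[w.lower()] for w in s.split()) / len(s.split()),
        reverse=True,
    )
    return ". ".join(scored[:k]) + "."

text = "Summarization shortens text. Summarization keeps key text. Birds fly south."
summary = extractive_summary(text, k=1)
```

An abstractive system would instead generate new wording (e.g. "Summaries condense documents"), which is why it needs a generative model rather than a sentence scorer.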
Abstractive Text Summarization for Bangla Language Using NLP and Machine Learning Approaches
Miazee, Asif Ahammad, Roy, Tonmoy, Islam, Md Robiul, Safat, Yeamin
Text summarization involves reducing extensive documents to short sentences that encapsulate the essential ideas. The goal is to create a summary that effectively conveys the main points of the original text. We spend a significant amount of time each day reading the newspaper to stay informed about current events both domestically and internationally. While reading newspapers enriches our knowledge, we sometimes come across unnecessary content that isn't particularly relevant to our lives. In this paper, we introduce a neural network model designed to summarize Bangla text into concise and straightforward paragraphs, aiming for greater stability and efficiency.
- North America > United States > Utah (0.05)
- North America > United States > Virginia (0.04)
- North America > United States > Iowa (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Overview (0.69)
- Research Report (0.64)
Survey of Pseudonymization, Abstractive Summarization & Spell Checker for Hindi and Marathi
Ransing, Rasika, Dhamaskar, Mohammed Amaan, Rajpurohit, Ayush, Dhoke, Amey, Dalvi, Sanket
India's vast linguistic diversity presents unique challenges and opportunities for technological advancement, especially in the realm of Natural Language Processing (NLP). While there has been significant progress in NLP applications for widely spoken languages, the regional languages of India, such as Marathi and Hindi, remain underserved. Research in the field of NLP for Indian regional languages is at a formative stage and holds immense significance. The paper aims to build a platform which enables the user to use various features like text anonymization, abstractive text summarization and spell checking in English, Hindi and Marathi language. The aim of these tools is to serve enterprise and consumer clients who predominantly use Indian Regional Languages.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- North America > Canada > Ontario > Toronto (0.05)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- (4 more...)
- Research Report (0.64)
- Overview (0.46)
Survey on Abstractive Text Summarization: Dataset, Models, and Metrics
Nnadi, Gospel Ozioma, Bertini, Flavio
Readers and scholars often desire a concise summary (Too Long; Didn't Read - TL;DR) of texts to effectively prioritize information. However, creating document summaries is mentally taxing and time-consuming, especially considering the overwhelming volume of documents produced annually, as depicted in Figure 1 [2] and Figure 2 [3]. For instance, [3] reported over 100,000 scientific articles on the coronavirus pandemic in 2020; although these articles contain brief abstracts, the sheer volume poses challenges for researchers and medical professionals in quickly extracting relevant knowledge on a specific topic. Automatically generated multi-document summaries could be valuable, providing readers with essential information and reducing the need to access original files unless refinement is necessary. Text summarization has garnered significant research attention, proving useful in search engines, news clustering, timeline generation, and various other applications. The objective of text summarization is to create a brief, coherent, factually consistent, and readable document that retains the essential information from the source, whether it is a single document or multiple documents. Single Document Summarization (SDS) uses only one input document, eliminating the need for additional processing to assess relationships between inputs; this method is suitable for summarizing standalone documents such as emails, legal contracts, and financial reports. The primary goal of Multi-Document Summarization (MDS) is to gather information from several texts addressing the same topic, often composed at different times or representing diverse perspectives. The overarching objective is to produce information reports that are both succinct and comprehensive, consolidating varied opinions from documents that explore a topic through multiple viewpoints.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- Europe > Italy > Tuscany > Florence (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (14 more...)
- Research Report (1.00)
- Overview (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- (2 more...)
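A first practical step in the MDS setting described above is pooling sentences from several same-topic documents while dropping near-duplicates, since different sources often repeat the same facts. The word-overlap threshold below is an illustrative assumption, not a value from the survey.

```python
def pool_unique_sentences(docs, overlap=0.8):
    """Pool sentences from several documents, skipping near-duplicates."""
    kept = []
    for doc in docs:
        for s in (x.strip() for x in doc.split(".") if x.strip()):
            words = set(s.lower().split())
            duplicate = any(
                len(words & set(k.lower().split())) / max(len(words), 1) >= overlap
                for k in kept
            )
            if not duplicate:
                kept.append(s)
    return kept

docs = [
    "The storm hit the coast. Power was lost",
    "The storm hit the coast. Repairs began",
]
pooled = pool_unique_sentences(docs)
```

Real MDS systems use stronger redundancy measures (e.g. embedding similarity), but the same pool-then-deduplicate structure applies.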
L3Cube-MahaSum: A Comprehensive Dataset and BART Models for Abstractive Text Summarization in Marathi
Deshmukh, Pranita, Kulkarni, Nikita, Kulkarni, Sanhita, Manghani, Kareena, Joshi, Raviraj
We present the MahaSUM dataset, a large-scale collection of diverse news articles in Marathi, designed to facilitate the training and evaluation of models for abstractive summarization tasks in Indic languages. The dataset, containing 25k samples, was created by scraping articles from a wide range of online news sources and manually verifying the abstract summaries. Additionally, we train an IndicBART model, a variant of the BART model tailored for Indic languages, using the MahaSUM dataset. We evaluate the performance of our trained models on the task of abstractive summarization and demonstrate their effectiveness in producing high-quality summaries in Marathi. Our work contributes to the advancement of natural language processing research in Indic languages and provides a valuable resource for future research in this area using state-of-the-art models.
- Asia > India > Maharashtra > Pune (0.05)
- Asia > India > Tamil Nadu > Chennai (0.04)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
Abstractive Text Summarization: State of the Art, Challenges, and Improvements
Shakil, Hassan, Farooq, Ahmad, Kalita, Jugal
Specifically focusing on the landscape of abstractive text summarization, as opposed to extractive techniques, this survey presents a comprehensive overview, delving into state-of-the-art techniques, prevailing challenges, and prospective research directions. We categorize the techniques into traditional sequence-to-sequence models, pre-trained large language models, reinforcement learning, hierarchical methods, and multi-modal summarization. Unlike prior works that did not examine complexities, scalability and comparisons of techniques in detail, this review takes a comprehensive approach encompassing state-of-the-art methods, challenges, solutions, comparisons, limitations and charts out future improvements - providing researchers an extensive overview to advance abstractive summarization research. We provide vital comparison tables across techniques categorized - offering insights into model complexity, scalability and appropriate applications. The paper highlights challenges such as inadequate meaning representation, factual consistency, controllable text summarization, cross-lingual summarization, and evaluation metrics, among others. Solutions leveraging knowledge incorporation and other innovative strategies are proposed to address these challenges. The paper concludes by highlighting emerging research areas like factual inconsistency, domain-specific, cross-lingual, multilingual, and long-document summarization, as well as handling noisy data. Our objective is to provide researchers and practitioners with a structured overview of the domain, enabling them to better understand the current landscape and identify potential areas for further research and improvement.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > India > NCT > New Delhi (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (13 more...)
- Research Report > Promising Solution (1.00)
- Research Report > New Finding (1.00)
- Overview (1.00)
HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew
Paz-Argaman, Tzuf, Mondshine, Itai, Mordechai, Asaf Achi, Tsarfaty, Reut
While large language models (LLMs) excel in various natural language tasks in English, their performance in lower-resourced languages like Hebrew, especially for generative tasks such as abstractive summarization, remains unclear. The high morphological richness in Hebrew adds further challenges due to the ambiguity in sentence comprehension and the complexities in meaning construction. In this paper, we address this resource and evaluation gap by introducing HeSum, a novel benchmark specifically designed for abstractive text summarization in Modern Hebrew. HeSum consists of 10,000 article-summary pairs sourced from Hebrew news websites written by professionals. Linguistic analysis confirms HeSum's high abstractness and unique morphological challenges. We show that HeSum presents distinct difficulties for contemporary state-of-the-art LLMs, establishing it as a valuable testbed for generative language technology in Hebrew, and MRLs generative challenges in general.
- Asia > Singapore (0.04)
- Asia > Middle East > Israel (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (5 more...)
Improving Sequence-to-Sequence Models for Abstractive Text Summarization Using Meta Heuristic Approaches
Saxena, Aditya, Ranjan, Ashutosh
As human society transitions into the information age, our attention spans are shrinking: the number of people who spend time reading lengthy news articles is decreasing rapidly, and the need for succinct information is higher than ever before. Therefore, it is essential to provide a quick overview of important news by concisely summarizing the top news article under the most intuitive headline. When humans write summaries, they extract the essential information from the source and add useful phrases and grammatical annotations to the original extract. Humans have a unique ability to create abstractions; automatic summarization, however, is a complicated problem to solve. The use of sequence-to-sequence (seq2seq) models for neural abstractive text summarization has been rising in prevalence. Numerous innovative strategies have been proposed to develop current seq2seq models further, permitting them to handle issues like saliency, fluency, and readability and to create excellent summaries. In this article, we aim to enhance present architectures and models for abstractive text summarization. The modifications focus on fine-tuning hyper-parameters and attempting specific encoder-decoder combinations. We ran extensive experiments on the widely used CNN/DailyMail dataset to check the effectiveness of the various models.
- Asia > Myanmar (0.04)
- Asia > India > NCT > Delhi (0.04)
- North America > United States > New York (0.04)
- (3 more...)
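Several abstracts above report ROUGE scores on CNN/DailyMail; as a reference point, here is a minimal ROUGE-1 (unigram overlap) F1 computation. Real evaluations use the official ROUGE toolkit with stemming and bootstrap resampling, both of which this sketch omits.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """F1 of unigram overlap between a candidate summary and a reference."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat", "the cat sat on the mat")
```

ROUGE-2 and ROUGE-L follow the same precision/recall/F1 pattern over bigrams and longest common subsequences, respectively.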